Leveraging Underutilized Risk-Adjusted Metrics to Create Targeted Trauma PI Efforts
Risk-Adjusted Mortality Measurement
Nicolas Foss, Ed.D., MS
2026-02-26
Objectives
Explain the benefits of using robust risk-adjusted metrics to analyze trauma patient cohorts
Use common tools to compute both the predicted probability of survival and the W score
Identify ways to use these metrics to inform performance improvement efforts in trauma care
Accessing this presentation
Risk Adjustment ✅
Raw mortality rates can mislead comparisons between hospitals.
Example: Hospital A has a raw mortality rate of 5%, while Hospital B has a rate of 10%. At first glance, it seems Hospital A is performing better.
However, Hospital B treats more older adults and more severe traumas (i.e., patients with a higher expected risk of death).
Risk Adjustment ✅
Basic data example
| Hospital | Raw Mortality Rate |
|----------|--------------------|
| A        | 5%                 |
| B        | 10%                |
Risk Adjustment ✅
Centers treat different mixes of patients — older adults, severe trauma, or complex injuries.
Example: Hospital A primarily treats young, healthy individuals with minor injuries.
Hospital B treats older adults with complex injuries.
Without adjusting for these differences, comparing their raw mortality rates would be unfair.
Risk Adjustment ✅
Patient data example
| Hospital | Age Group    | Injury Severity |
|----------|--------------|-----------------|
| A        | Young        | Minor           |
| B        | Older Adults | Severe          |
Risk Adjustment ✅
Fair benchmarking requires adjustment for patient risk.
Example: By using risk-adjusted metrics, we can account for differences in patient populations. Considering the higher risk of its patients, Hospital B may actually be performing better than we thought, and not significantly differently from Hospital A.
Risk Adjustment ✅
Example of adjusted rates
| Hospital | Adjusted Mortality Rate |
|----------|-------------------------|
| A        | 6%                      |
| B        | 7%                      |
Risk Adjustment ✅
The goal: compare performance given who each center treats, not just raw outcomes.
Example: After adjusting for patient risk, we find that Hospital B’s performance is on par with Hospital A, despite the higher raw mortality rate. This adjustment provides a more accurate and fair comparison of hospital performance.
Risk Adjustment ✅
Example of raw and adjusted together
| Hospital | Raw Mortality Rate | Adjusted Mortality Rate |
|----------|--------------------|-------------------------|
| A        | 5%                 | 6%                      |
| B        | 10%                | 7%                      |
Concept of Risk Adjustment
For each patient, we estimate a predicted probability of survival, \(P(\text{Survival})\).
The estimate depends on key variables:
Injury Severity Score (ISS)
Revised Trauma Score (RTS)
Age Index (0 if age <= 54, 1 if age > 54)
Mechanism (Blunt vs. Penetrating)
Probability of survival
Modern trauma registries and EHRs will do this calculation for you. Blunt and penetrating injuries use different coefficient sets to estimate the predicted probabilities. The survival probability is computed from a logistic regression equation of the form: \[
\text{Survival Probability} = \frac{1}{1 + e^{-b}}
\]
where \[
b = \beta_{0} + \beta_{1} \times \text{RTS} + \beta_{2} \times \text{ISS} + \beta_{3} \times \text{AgeIndex}
\]
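As a sketch, the logistic equation above can be computed directly. The coefficients below are the commonly cited adult TRISS values from the original MTOS literature; treat them as illustrative and verify against the coefficients your registry actually uses.

```python
import math

# Commonly cited adult TRISS coefficients (MTOS-era); verify against your registry.
COEFS = {
    "blunt":       {"b0": -0.4499, "rts": 0.8085, "iss": -0.0835, "age": -1.7430},
    "penetrating": {"b0": -2.5355, "rts": 0.9934, "iss": -0.0651, "age": -1.1360},
}

def p_survival(rts, iss, age, mechanism="blunt"):
    """Predicted probability of survival via the TRISS logistic equation."""
    c = COEFS[mechanism]
    age_index = 0 if age <= 54 else 1  # dichotomized age, per TRISS
    b = c["b0"] + c["rts"] * rts + c["iss"] * iss + c["age"] * age_index
    return 1 / (1 + math.exp(-b))
```

For example, a young blunt-trauma patient with a normal RTS (7.84) and ISS of 9 gets a predicted survival probability above 0.99, while the same injuries in a patient over 54 yield a lower probability.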
Understanding \(P(S)\)
Probability of Survival: The chance a patient will survive.
Formula: Combines factors to predict survival.
Simplified: A logistic transformation maps these factors onto a probability between 0 (no chance) and 1 (certain survival).
From Individual to System-Level Benchmark
Compute \(P(\text{Survival})\) for each patient.
Sum these probabilities across all patients -> Expected Survivors.
Sum \(1 - P(\text{Survival})\) across all patients -> Expected Decedents.
Compare to Observed Survivors and Decedents from actual outcomes.
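The aggregation steps above can be sketched with a handful of hypothetical per-patient probabilities (the values here are invented for illustration):

```python
# Hypothetical per-patient predicted survival probabilities (illustrative only).
ps_values = [0.99, 0.97, 0.95, 0.90, 0.60, 0.30]
survived  = [True, True, True, True, False, True]  # actual outcomes

expected_survivors = sum(ps_values)                      # ~4.71
expected_decedents = sum(1 - p for p in ps_values)       # ~1.29
observed_survivors = sum(survived)                       # 5
observed_decedents = len(survived) - observed_survivors  # 1
```

Here the center observed slightly more survivors (5) than expected (about 4.71), the raw ingredients of the W-score.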
Why This is a Benchmarking Standard
The ACS Trauma Quality Improvement Program (TQIP) uses similar models for national benchmarking.
This approach adjusts for injury severity, physiology, and demographics.
Why This is a Benchmarking Standard
It provides a fair, risk-adjusted performance measure.
But how do I calculate these risk-adjusted metrics, and what are they?
#mathishard
W-Score
The W-score quantifies how a trauma center performs relative to expected outcomes.
It expresses the difference between observed and expected survivors, scaled to patient volume.
\[
W = \frac{A - B}{C} \times 100
\]
W-Score
Where:
\(A\) = Observed survivors: the total number of patients with all data necessary to calculate \(P(Survival)\), minus the number of those patients who died
\(B\) = Expected survivors: the sum of all predicted survival probabilities \(P(Survival)\) for this patient group
\(C\) = Total number of patients with all data necessary to calculate \(P(Survival)\)
W-Score
Interpretation for clinicians:
\(W > 0\) -> More survivors than expected; center performing better than average
\(W < 0\) -> Fewer survivors than expected; center performing worse than average
Provides a volume-adjusted, risk-adjusted measure similar in purpose to the Relative Mortality Metric (RMM), discussed later.
Example: W-Score Calculation
Let:
\(n = 900\) total patients
\(n_{\text{deaths}} = 40\) deaths
\(\sum P(Survival) = 750.3638\) (sum of predicted survivals)
Step 1: Compute observed survivors
\[
A = n - n_{\text{deaths}}
\]
\[
A = 900 - 40 = 860
\]
Step 2: Define expected survivors
\[
B = \sum P(Survival) = 750.3638
\]
Step 3: Apply W-score formula
\[
W = \frac{A - B}{C} \times 100
\]
Substitute known values:
\[
W = \frac{860 - 750.3638}{900} \times 100
\]
Step 4: Compute W-score
\[
W = \frac{109.6362}{900} \times 100 = 12.18
\]
Step 5: Inference
\(W = 12.18\)
-> The center achieved about 12 more survivors per 100 patients than expected.
Indicates better-than-expected performance after adjusting for patient risk.
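The worked example above reduces to a few lines:

```python
n = 900            # total patients with complete data (C)
deaths = 40
sum_ps = 750.3638  # sum of predicted survival probabilities (B)

A = n - deaths               # observed survivors: 860
W = (A - sum_ps) / n * 100   # excess survivors per 100 patients
# round(W, 2) -> 12.18
```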
W Score is limited
The W-score method is derived from the MTOS study, which relied on linear methods.
It divides patients into bins of equal width based on predicted survival probability, \(P(Survival)\).
This assumes that \(P(Survival)\) is evenly (uniformly) distributed across the 0–1 range.
W Score is limited
Problem: \(P(Survival)\) from logistic regression is not normally distributed; many patients cluster near very high or very low survival probabilities.
Linear bins overrepresent some risk groups and underrepresent others, which can distort observed vs expected comparisons.
Distribution of Predicted Survival
Empirical data show that trauma patients are not evenly distributed across predicted survival probabilities.
Most patients presenting to trauma centers have a very high likelihood of survival.
MTOS Distribution
| Ps Range    | Proportion of Patients |
|-------------|------------------------|
| 0.96 – 1.00 | 0.842                  |
| 0.91 – 0.95 | 0.053                  |
| 0.76 – 0.90 | 0.052                  |
| 0.51 – 0.75 | 0.000                  |
| 0.26 – 0.50 | 0.043                  |
| 0.00 – 0.25 | 0.010                  |
W-Score Can Be Misleading
The W-score is heavily influenced by the majority of patients with very high \(P(Survival)\) values (for example, \(P(Survival) > 0.8\)).
Because most trauma patients are expected to survive, the W-score often reflects performance among the least acute patients, not those at highest risk.
W-Score Can Be Misleading
This means two centers could have identical W-scores even if one performs much better with severely injured patients.
Assumption vs. Reality
The W-score assumes that \(P(Survival)\) values are uniformly distributed among patients across the 0–1 range.
However, observed data show that \(P(Survival)\) is highly skewed, with most patients near 1.0.
Therefore, linear bins or evenly spaced \(P(Survival)\) categories overweight low-acuity patients and underweight critical cases.
Take-Home Message on the W Score
W-score alone provides a partial picture of trauma center performance.
For a fair comparison, models such as the Relative Mortality Metric (RMM) use non-linear binning that reflects the true, non-normal \(P(Survival)\) distribution observed in real trauma data.
Relative Mortality Metric (RMM)
Napoli et al. (2017)
The RMM is a risk-adjusted metric that compares observed mortality to predicted mortality.
It accounts for patient-level severity, physiology, and demographics using previously validated coefficients.
Relative Mortality Metric (RMM)
Positive RMM -> higher-than-expected survival.
Negative RMM -> lower-than-expected survival.
Helps benchmark trauma center performance fairly.
Non-Linear Binning: Why It Matters
Because \(P(Survival)\) is skewed, non-linear bins capture the distribution more accurately.
Examples of non-linear binning:
Quantiles (equal number of patients per bin)
Clinically meaningful thresholds (e.g., very high risk vs moderate vs low)
Non-Linear Binning: Why It Matters
This allows fairer comparison of observed vs expected outcomes across risk groups.
Ensures that the benchmarking metrics (RMM, W-score) reflect actual patient risk rather than arbitrary binning.
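A quick sketch of why equal-count (quantile) bins suit skewed survival probabilities better than equal-width bins. The \(P(Survival)\) values below are hypothetical, not registry data:

```python
# Sketch: equal-count (quantile) bins vs equal-width bins on skewed Ps values.
# The Ps values below are hypothetical; real values come from your registry.
ps = sorted([0.99, 0.99, 0.98, 0.98, 0.97, 0.96, 0.95, 0.92,
             0.85, 0.70, 0.40, 0.15])

def equal_count_bins(sorted_values, k):
    """Split already-sorted values into k bins holding equal patient counts."""
    size = len(sorted_values) // k
    return [sorted_values[i * size:(i + 1) * size] for i in range(k)]

quantile_bins = equal_count_bins(ps, 4)  # 4 bins of 3 patients each

# With equal-width bins of 0.25, almost everyone lands in the top bin:
top_quarter = sum(1 for p in ps if p >= 0.75)  # 9 of 12 patients
```

Equal-width bins pile most patients into the highest-survival bin, while quantile bins keep every risk stratum represented with the same number of patients.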
Interpretation for clinicians:
- Positive RMM -> observed mortality is lower than expected, better performance.
- Negative RMM -> observed mortality is higher than expected, worse performance.
- Weighted binning ensures fair comparison across different patient risk levels.
- Easy interpretation on a scale from -1 (bad) to 1 (great), where 0 is “met expectations”.
Why Bin Weighting Matters
Predicted survival probabilities \(P(Survival)\) are not evenly distributed — most patients may cluster at high or low survival.
Using weighted bins ensures that each risk group contributes appropriately to the RMM.
This prevents over- or under-representation of patient subgroups in the metric.
RMM thus provides a clinically meaningful, risk-adjusted benchmark for trauma center performance.
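As a schematic of weighted binning (this is NOT Napoli et al.'s published RMM estimator; the bin boundaries, weights, and rates are invented for illustration), each risk bin contributes its observed-vs-expected mortality gap weighted by its share of patients:

```python
# Schematic of weighted binning: each risk bin contributes its performance
# gap, weighted by its share of patients. NOT the published RMM estimator;
# bins, weights, and rates here are illustrative only.
risk_bins = [
    # (patient-share weight, expected mortality rate, observed mortality rate)
    (0.84, 0.01, 0.008),  # low-risk majority
    (0.11, 0.15, 0.140),  # moderate risk
    (0.05, 0.60, 0.500),  # high-risk minority
]

score = sum(w * (exp_m - obs_m) / exp_m for w, exp_m, obs_m in risk_bins)
# Positive -> observed mortality below expected across the weighted bins.
```

Without the weights, the tiny high-risk bin would dominate the sum; with them, each subgroup contributes in proportion to its actual patient share.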
Key Takeaways for Clinicians on RMM
RMM and W-score are risk-adjusted metrics, accounting for patient severity and demographics.
Linear binning (as used in the MTOS-era W- and M-scores) can be misleading because predicted survival probabilities are skewed.
Non-linear binning improves interpretation, particularly for observed vs expected mortality analyses.
Key Takeaways for Clinicians on RMM
Using these methods allows trauma centers to compare performance fairly and identify opportunities for improvement.
Let’s see some examples of RMM in action
Iowa!!!
Data used for RMM calculations in Iowa
Sample data
Removed all records with missing values from the larger dataset
Took each patient's lowest Ps value and flagged whether the patient ever died per injury event
n = 101,194 patient encounters
Data used for RMM calculations in Iowa
State-level RMM 2020-2024
Digging deeper
RMM by Trauma Type
RMM by Age Group
RMM by Biological Sex
A job well done
From 2020-2024, we expected 3,929 deaths.
We observed 2,377 deaths.
Overall, Iowa trauma centers saved 1,552 trauma patients who were predicted to die from 2020-2024.
Takeaways
It is not enough to simply review raw survival/mortality outcomes
Relying on unadjusted outcome calculations will skew your statistical inference.
The field has come far in providing mathematically robust tools for sound statistical inference.
Risk adjustment is not hard to access, given ample free and open source software (FOSS)
Analyses
At BEMTS, we have been hard at work creating open source software that benefits Iowans and other jurisdictions.
{traumar} package page
Questions?
Thanks!
Nicolas Foss, Ed.D., MS
Epidemiologist
Bureau of Emergency Medical and Trauma Services
Bureau of Health Statistics
Division of Public Health > Iowa HHS
C: 515.985.9627 || E: nicolas.foss at hhs.iowa.gov